UNDERSTANDING ADMIXTURE FRACTIONS

MASON LIANG AND RASMUS NIELSEN 

Abstract. Estimation of admixture fractions has become one of the 
most commonly used computational tools in population genomics. How- 
ever, there is remarkably little population genetic theory on their sta- 
tistical properties. We develop theoretical results that can accurately 
predict means and variances of admixture proportions within a pop- 
ulation using models with recombination and genetic drift. Based on 
established theory on measures of multilocus disequilibrium, we show 
that there is a set of recurrence relations that can be used to derive 
expectations for higher moments of the admixture fraction distribution. 
We obtain closed form solutions for some special cases. Using these re-
sults, we develop a method for estimating admixture parameters from
estimated admixture proportions obtained from programs such as Struc-
ture or Admixture. We apply this method to HapMap data and find
that the population history of African Americans, as expected, is not 
best explained by a single admixture event between people of European 
and African ancestry. A model of constant gene flow for the past 11 
generations until 2 generations ago gives a better fit. 



Introduction 

It is common in population genetic analyses to consider individuals as 
belonging fractionally to two or more discrete source populations. The pro- 
portion of an individual's genome that belongs to a population is called 
that individual's 'admixture fraction' or 'admixture proportion'. Programs 
such as Structure (Pritchard et al., 2000), Eigenstrat (Price et al., 2006), 
Frappe (Tang et al., 2005), or Admixture (Alexander et al., 2009) can jointly 
estimate these admixture fractions for multiple individuals in a sample, along 
with the corresponding allele frequencies in each of the source populations. 
These admixture fractions are often presented in a 'structure plot,' an ex- 
ample of which is shown in Figure 1. We will henceforth refer to these 
methods as 'structure analyses'. This approach has proven highly useful for 
understanding genetic relationships in many different species, e.g. humans 
(Rosenberg et al., 2002), cats (Menotti-Raymond et al., 2008), or pandas
(Zhang et al., 2007). Other analyses reconstruct admixture tracts for each 
genome in the sample, by inferring the local ancestry of every position, or 
window, in each sampled genome (Tang et al., 2006; Maples et al., 2013). In 
this context, the admixture fraction for a genome is the fraction of its total 
length that is inherited from a particular source population. 

Although structure analyses are not tied to any particular mechanistic 
model of population history and demography, the admixture fractions and 
admixture tracts are commonly interpreted to be the result of past admix- 
ture events in which modern populations were formed by admixture (or 
introgression) between ancestral source populations. The distribution of 
admixture tract lengths has been related to specific mechanistic models of 
admixture (Falush et al., 2003; Tang et al., 2006; Pool and Nielsen, 2009), 
and has been used to estimate times of admixture (Gravel, 2012). However, 
the admixture proportions themselves also contain information regarding 
admixture times. Following an admixture event, the variance in admixture 
proportions within a population will be high, but will thereafter decrease, 
and will eventually converge to zero in the limit of large genomes. The 
variance in admixture fractions among individuals contains substantial in- 
formation about the time since admixture that can be used in addition to 
the tract length distribution. In some cases, this may be more robust than 
inferences based on tract lengths, because the length distribution of tracts 
is often difficult to infer, and is often not modeled accurately by the hid- 
den Markov model (HMM) methods used to infer tract lengths (Liang and 
Nielsen, 2014). Even in cases where tract lengths can be accurately inferred, 
studies aimed at estimating admixture times should benefit from using both
the variance in admixture proportions among individuals and the overall
admixture tract length distributions.

Verdu and Rosenberg (2011) developed a method for computing moments
of admixture proportions in a model in which an admixed population is formed
as a mixture between multiple source populations, allowing for arbitrary
gene-flow from the source populations over a number of generations (g).
They establish recursions for the moments of the admixture fractions and
use these equations to determine how the mean and the variance change
through time in particular admixture scenarios. These moments are
expectations for a single individual's admixture fraction and are averaged over
the possible genealogical histories of the population. As a result, they can be
difficult to relate to data because replicates from multiple identical
populations are rarely available. In this paper, we consider a different problem:
calculating sample moments for admixture proportions obtained
from individuals in one population.

We extend the model in Verdu and Rosenberg (2011) to incorporate
the effects of recombination and genetic drift by adding a random union of
zygotes component. Recombination is important because even if one half of
a chromosome's ancestors are from the first source population, it is unlikely
that exactly one half of that chromosome's genetic material is inherited
from that population. Genetic drift is important because the individuals in
a sample might share ancestors and, therefore, have more similar admixture
fractions than expected by chance in a model without drift. The results
developed in this paper should be directly applicable for quantifying the
results of a structure analysis.

The General Mechanistic Model

We start by considering admixture fractions in haploid genomes. These
haploid admixture fractions can later be paired up to create diploid
admixture fractions. The admixture fraction of a (haploid) genome, $H_i$, is the
proportion of that genome that is inherited from a particular source
population. For notational simplicity, we consider gene-flow from only one
population into another. We will later discuss how to extend this model to
multiple admixing source populations. We use the same mechanistic admixture
model as Verdu and Rosenberg (2011), and will use its notation where possible.
Finally, we use the random union of zygotes model, with a diploid population
size of $N$ ($2N$ chromosomes), for genetic drift and recombination, and
assume a sample size of $n$ chromosomes from a single population.

In this model, a hybrid population of $N$ diploid individuals forms in
generation 1 from two previously isolated source populations. In this first
generation, individuals in the hybrid population are from the first source
population with probability $s_0$ or from the second source population with
probability $1 - s_0$. In generation $g + 1$, each chromosome is, independently,
from the first source population with introgression probability $s_g$, or from
the hybrid population with probability $1 - s_g$. Chromosomes inherited from
the hybrid population are the product of the recombination of the two
chromosomes of one individual (zygote), chosen uniformly at random. Finally,
these $2N$ chromosomes are paired up to form the $N$ individuals in generation
$g + 1$.
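As a rough illustration of the model just described, the following Python sketch simulates one generation of the random union of zygotes with introgression. It is a simplification under stated assumptions, not the simulation code used in this paper: each chromosome is discretized into `n_loci` equally spaced loci, crossovers between adjacent loci occur independently with a hypothetical per-interval probability `r` (so there is no interference), and the names `found_population` and `next_generation` are illustrative.

```python
import numpy as np

def found_population(n_ind, n_loci, s0, rng):
    """Founding generation: rows 2k and 2k+1 are the chromosomes of individual k."""
    # Each individual is from the first source (ancestry 0) with probability s0,
    # otherwise from the second source (ancestry 1).
    from_first = rng.random(n_ind) < s0
    per_chrom = np.repeat(np.where(from_first, 0, 1), 2)
    return np.tile(per_chrom[:, None], (1, n_loci))

def next_generation(pop, s_g, r, rng):
    """One generation of introgression, recombination and drift (assumed discretization)."""
    two_n, n_loci = pop.shape
    parents = rng.integers(0, two_n // 2, size=two_n)      # zygote chosen uniformly
    start = rng.integers(0, 2, size=(two_n, 1))            # strand copied at locus 0
    switches = rng.random((two_n, n_loci - 1)) < r         # crossover indicators
    n_cross = np.concatenate(
        [np.zeros((two_n, 1), dtype=int), np.cumsum(switches, axis=1)], axis=1)
    strand = (start + n_cross) % 2                          # parental strand per locus
    child = pop[2 * parents[:, None] + strand, np.arange(n_loci)[None, :]]
    # With probability s_g a chromosome is instead drawn from the first source.
    migrants = rng.random(two_n) < s_g
    child[migrants] = 0
    return child

rng = np.random.default_rng(0)
pop = found_population(n_ind=50, n_loci=20, s0=0.3, rng=rng)
pop_next = next_generation(pop, s_g=0.0, r=0.1, rng=rng)
```

In this discretized setting, `pop_next.mean(axis=1)` approximates the fraction of each chromosome inherited from the second source population.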

Finally, we let the stochastic process $A(\ell)$ represent the local ancestry
along a chromosome as a function of $\ell$, the physical position:
\[
A(\ell) =
\begin{cases}
0 & \text{if } \ell \text{ is descended from the first source population,} \\
1 & \text{if } \ell \text{ is descended from the second source population.}
\end{cases}
\]
The fraction of the chromosome descended from the second source population
is given by
\[
H = \frac{1}{L} \int_0^L A(\ell)\, d\ell,
\]
where $L$ is the total length of the chromosome.
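Because $A(\ell)$ is piecewise constant, the integral defining $H$ reduces to a sum of tract lengths. The short Python sketch below illustrates this; the tract representation as `(start, end, ancestry)` tuples and the function name `admixture_fraction` are assumptions made for the example only.

```python
def admixture_fraction(tracts, length):
    """Return H = (1/L) * integral of A(l) dl for piecewise-constant ancestry.

    tracts: list of (start, end, ancestry) tuples with ancestry in {0, 1},
    assumed to tile [0, length] without gaps or overlaps (hypothetical format).
    """
    covered = sum(end - start for start, end, _ in tracts)
    if abs(covered - length) > 1e-9:
        raise ValueError("tracts must cover the whole chromosome")
    # Only tracts with A = 1 (second source population) contribute.
    return sum(end - start for start, end, a in tracts if a == 1) / length

# A 1.5 Morgan chromosome with two tracts from the second source population.
example = [(0.0, 0.3, 1), (0.3, 0.9, 0), (0.9, 1.5, 1)]
h = admixture_fraction(example, 1.5)
```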

Assume that $g$ generations after the start of admixture we have randomly
sampled $n$ chromosomes from the hybrid population and determined their
corresponding admixture fractions, $H_{1(g)}, H_{2(g)}, \ldots, H_{n(g)}$. We are
interested in the joint distribution of these $n$ random variables. When $n = 1$
and as $L \to \infty$, this is the admixture fraction considered by Verdu and
Rosenberg (2011).

Because the $n$ chromosomes have possibly overlapping genealogies, the
admixture fractions are not independent. However, the joint distribution
of the admixture fractions does not depend on their ordering, so they are
exchangeable. As a result, they can be viewed as independent and
identically distributed (iid) draws from a random distribution $Q$. This random
distribution can be interpreted as a function of the random genealogy of
the entire hybrid population up to $g$ generations in the past. When $g$ is
small, the genealogies of the $n$ samples will be unlikely to differ from $n$
non-overlapping binary trees, so $Q$ will be approximately constant. If $g$ is
large, however, these genealogies are likely to overlap, and this will no
longer be true.

Verdu and Rosenberg (2011) focus on moments of $H_{1(g)}$, in particular on
the mean and variance. However, because the admixture fractions are not
independent, even as $n \to \infty$, the sample mean and sample variance will
converge to the mean and variance of $Q$, which are random quantities. For
example,
\[
E(H_{1(g)}) \neq E(H_{1(g)} \mid Q) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} H_{i(g)},
\]
\[
\operatorname{var}(H_{1(g)}) \neq \operatorname{var}(H_{1(g)} \mid Q)
= \lim_{n \to \infty} \frac{1}{n-1} \sum_{i=1}^{n} \Bigl( H_{i(g)} - \frac{1}{n} \sum_{j=1}^{n} H_{j(g)} \Bigr)^{2},
\]



and similarly for higher-order moments. The moments of the admixture
fractions have two components: randomness from sampling the population
genealogy, and randomness from the sampling of chromosomes. The
expressions on the left account for both, while the expressions on the right
account only for the latter. Variances among individuals within one
population correspond to $\operatorname{var}(H_{1(g)} \mid Q)$, while variances over replicate
populations correspond to $\operatorname{var}(H_{1(g)})$. This latter value will be larger than
the expected sample variance calculated from multiple individuals sampled
from the same population, and will rarely be useful for inference purposes.

In the following sections, we will show how the constants on the left-hand
side, as well as expectations of the random variables on the right-hand side,
can be derived for mechanistic models of introgression. By comparing these
expectations to the observed admixture parameters from a sample, we will
be able to construct a method of moments estimator for the parameters of
the model.

Let $k_1$ be the sample mean:
\[
k_1 = \frac{1}{n} \sum_{i=1}^{n} H_{i(g)}.
\]
We can express its expectation in terms of the 1-point correlation function
of $A$:
\[
E(k_1) = E(H_{1(g)}) = P\{A_{1(g)}(0) = 1\}.
\]
Similarly, let $k_2$ be the unbiased sample variance:



Downloaded from http://biorxiv.org/on September 18, 2014 



\[
k_2 = \frac{1}{n-1} \sum_{i=1}^{n} \bigl( H_{i(g)} - k_1 \bigr)^{2}.
\]
Its expectation is given by
\[
E(k_2) = E\bigl(H_{1(g)}^{2}\bigr) - E\bigl(H_{1(g)} H_{2(g)}\bigr).
\]
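For concreteness, the first three $k$-statistics of a sample of admixture fractions can be computed with the standard unbiased formulas, as in the Python sketch below. The function name `k_statistics` and the example values are illustrative assumptions, not data from this paper.

```python
import numpy as np

def k_statistics(h):
    """Return (k1, k2, k3): unbiased estimators of the first three cumulants."""
    h = np.asarray(h, dtype=float)
    n = h.size
    d = h - h.mean()
    k1 = h.mean()
    k2 = np.sum(d ** 2) / (n - 1)                     # unbiased sample variance
    k3 = n * np.sum(d ** 3) / ((n - 1) * (n - 2))     # unbiased third cumulant
    return k1, k2, k3

# Hypothetical admixture fractions for four sampled chromosomes.
k1, k2, k3 = k_statistics([0.1, 0.2, 0.4, 0.7])
```

The third statistic requires at least three observations, since its denominator vanishes for $n \le 2$.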

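These expectations can be written in terms of two-point correlation
functions of $A$:
\[
\begin{aligned}
E\bigl(H_{1(g)}^{2}\bigr)
&= \frac{1}{L^{2}}\, E\left( \int_0^L A_{1(g)}(\ell)\, d\ell \int_0^L A_{1(g)}(\ell')\, d\ell' \right) \\
&= \frac{1}{L^{2}} \int_0^L \!\! \int_0^L E\bigl( A_{1(g)}(\ell)\, A_{1(g)}(\ell') \bigr)\, d\ell\, d\ell' \\
&= \frac{1}{L^{2}} \int_0^L \!\! \int_0^L P\{A_{1(g)}(\ell) = 1, A_{1(g)}(\ell') = 1\}\, d\ell\, d\ell'.
\end{aligned}
\]
Similarly,
\[
E\bigl(H_{1(g)} H_{2(g)}\bigr) = \frac{1}{L^{2}} \int_0^L \!\! \int_0^L P\{A_{1(g)}(\ell) = 1, A_{2(g)}(\ell') = 1\}\, d\ell\, d\ell'.
\]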
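Writing these two correlation functions as
\[
\mathbf{v}_{2(g)} =
\begin{pmatrix}
P\{A_{1(g)}(\ell) = 1, A_{1(g)}(\ell') = 1\} \\
P\{A_{1(g)}(\ell) = 1, A_{2(g)}(\ell') = 1\}
\end{pmatrix},
\]
we find that
\[
E(k_2) = \frac{1}{L^{2}} \int_0^L \!\! \int_0^L \begin{pmatrix} 1 & -1 \end{pmatrix} \mathbf{v}_{2(g)}\, d\ell\, d\ell'. \tag{1}
\]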

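In general, the $i$th $k$-statistic is an unbiased estimator of the $i$th cumulant
of $Q$, and its expectation can be written as an integral over $[0, L]^i$ of a
linear combination of $i$-point correlation functions. For example,
\[
E(k_3) = \frac{1}{L^{3}} \int_0^L \!\! \int_0^L \!\! \int_0^L
\begin{pmatrix} 1 & -1 & -1 & -1 & 2 \end{pmatrix} \mathbf{v}_{3(g)}\, d\ell\, d\ell'\, d\ell'',
\]
\[
E(k_4) = \frac{1}{L^{4}} \int_{[0,L]^{4}}
\bigl( 1,\ \underbrace{-1, \ldots, -1}_{4 \text{ times}},\ \underbrace{-1, \ldots, -1}_{3 \text{ times}},\ \underbrace{2, \ldots, 2}_{6 \text{ times}},\ -6 \bigr)\, \mathbf{v}_{4(g)}\, d\ell\, d\ell'\, d\ell''\, d\ell'''.
\]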



Remarkably, the linear combinations required to compute the expectations
of the $k$-statistics correspond exactly to the higher-order disequilibria
as defined by Bennett (1952). Furthermore, if we instead choose to
compute the expectations of the $h$-statistics, which estimate the central
moments, the linear combinations would correspond to the higher-order
disequilibria as defined by Slatkin (1972).

We next find the recurrence relations these correlation functions satisfy
and solve them in some special cases. In particular, we will consider the
case of a single admixture event $g$ generations ago and the case of constant
gene-flow starting $g$ generations ago.



A Single Admixture Event. We start with a simple case, where introgression
occurs only in the founding generation, i.e. $s_g = 0$ for $g > 0$. Using
the random union of zygotes model, we can compute $\mathbf{v}_{2(g)}$ in terms of the
probabilities from the previous generation.
If two sites at $\ell$ and $\ell'$ are on the same chromosome in generation $g + 1$,
then they were inherited from one chromosome from generation $g$ with
probability $[\ell\ell']$ and from two chromosomes from generation $g$ with probability
$[\ell|\ell']$. If they are on different chromosomes, then the probability that they
are descended from one chromosome in generation $g$ is $\frac{1}{2N}[\ell\ell']$ and the
probability that they are descended from two chromosomes is
$\frac{1}{2N}[\ell|\ell'] + \bigl(1 - \frac{1}{2N}\bigr)$.
In matrix notation,
\[
\mathbf{v}_{2(g+1)} = (\mathbf{L}_2 \mathbf{U}_2)\, \mathbf{v}_{2(g)},
\qquad \text{so that} \qquad
\mathbf{v}_{2(g)} = (\mathbf{L}_2 \mathbf{U}_2)^{g}\, \mathbf{v}_{2(0)},
\]
where the recombination and drift matrices are given by
\[
\mathbf{U}_2 = \begin{pmatrix} [\ell\ell'] & [\ell|\ell'] \\ 0 & 1 \end{pmatrix},
\qquad
\mathbf{L}_2 = \begin{pmatrix} 1 & 0 \\ \frac{1}{2N} & 1 - \frac{1}{2N} \end{pmatrix}.
\]



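This is the same matrix equation derived for the decay of two-locus linkage
disequilibrium (Wright, 1933; Hill and Robertson, 1966). The 'alleles'
we consider are the local ancestry at $\ell$ and $\ell'$. To the extent possible, our
notation will follow Hill (1974), whose results for measures of multi-locus
linkage disequilibria we use. The matrices $\mathbf{L}_2$ and $\mathbf{U}_2$ share $(1\ \ {-1})$ as a
left-eigenvector, with corresponding eigenvalues $1 - \frac{1}{2N}$ and $[\ell\ell']$. As a
result,
\[
\begin{aligned}
E(k_2) &= \frac{1}{L^{2}} \int_0^L \!\! \int_0^L \begin{pmatrix} 1 & -1 \end{pmatrix} (\mathbf{L}_2 \mathbf{U}_2)^{g}\, \mathbf{v}_{2(0)}\, d\ell\, d\ell' \\
&= \Bigl(1 - \frac{1}{2N}\Bigr)^{g} \bigl(s_0 - s_0^{2}\bigr) \frac{1}{L^{2}} \int_0^L \!\! \int_0^L [\ell\ell']^{g}\, d\ell\, d\ell'.
\end{aligned}
\tag{2}
\]
For a model using the Haldane map function, $[\ell|\ell'] = \frac{1 - \exp(-2|\ell - \ell'|)}{2}$, this
equation becomes
\[
E(k_2) = \Bigl(1 - \frac{1}{2N}\Bigr)^{g} \bigl(s_0 - s_0^{2}\bigr) \frac{1}{L^{2}} \int_0^L \!\! \int_0^L \Bigl( \frac{1 + e^{-2|\ell - \ell'|}}{2} \Bigr)^{g} d\ell\, d\ell',
\]
while for a model of complete crossover interference on a chromosome of
length 1 Morgan, we can get a closed form solution:
\[
E(k_2) = \Bigl(1 - \frac{1}{2N}\Bigr)^{g} \bigl(s_0 - s_0^{2}\bigr) \int_0^1 \!\! \int_0^1 \bigl(1 - |\ell - \ell'|\bigr)^{g}\, d\ell\, d\ell'
= \Bigl(1 - \frac{1}{2N}\Bigr)^{g} \bigl(s_0 - s_0^{2}\bigr) \frac{2}{2 + g}.
\]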

For predicting the expected sample variance, the difference between these
two models is not large, as shown in Figure 4. For the simulations and
inference in this paper, we will ignore crossover interference, and use the
Haldane map function. However, none of the mathematical results of this
paper will require this assumption.
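The single-pulse expectation in equation 2 can be evaluated numerically. Because the integrand depends only on the distance $d = |\ell - \ell'|$, the double integral reduces to $\frac{1}{L^2}\int_0^L 2(L - d)\,[\ell\ell']^g\, dd$. The Python sketch below uses this reduction with a simple trapezoidal rule; the function names, the grid size, and the parameter values in the example are assumptions made for illustration.

```python
import numpy as np

def no_recomb_haldane(d):
    """P(no recombination) at map distance d (Morgans), Haldane map function."""
    return 0.5 * (1.0 + np.exp(-2.0 * d))

def no_recomb_interference(d):
    """P(no recombination) under complete interference (valid for d <= 1 Morgan)."""
    return 1.0 - d

def expected_k2_pulse(g, s0, n_dip, length, no_recomb, n_grid=20001):
    """E(k_2) after a single pulse, g generations ago, following equation 2."""
    d = np.linspace(0.0, length, n_grid)
    y = 2.0 * (length - d) * no_recomb(d) ** g      # weight 2(L - d) for distance d
    step = d[1] - d[0]
    integral = step * (y.sum() - 0.5 * (y[0] + y[-1]))   # trapezoidal rule
    drift = (1.0 - 1.0 / (2 * n_dip)) ** g
    return drift * s0 * (1.0 - s0) * integral / length ** 2

# Illustrative values only: s0 = 0.3, N = 10**4, L = 1 Morgan, g = 10.
e_interf = expected_k2_pulse(10, 0.3, 10**4, 1.0, no_recomb_interference)
e_haldane = expected_k2_pulse(10, 0.3, 10**4, 1.0, no_recomb_haldane)
closed_form = (1 - 1 / (2 * 10**4)) ** 10 * 0.3 * 0.7 * 2 / (2 + 10)
```

Under complete interference the numerical value should agree with the closed form $2/(2+g)$ up to discretization error, which gives a quick check on the integration.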
For computing higher-order correlation functions, we find a similar
equation:
\[
\mathbf{v}_{i(g)} = (\mathbf{L}_i \mathbf{U}_i)^{g}\, \mathbf{v}_{i(0)}. \tag{3}
\]
Bennett's coefficients for higher-order linkage are left-eigenvectors of the
recombination matrix $\mathbf{U}_i$. For $i = 3$, the coefficient vector is also a
left-eigenvector of the drift matrix, so we immediately get that

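\[
E(k_3) = s_0 (1 - s_0)(1 - 2 s_0) \Bigl(1 - \frac{1}{2N}\Bigr)^{g} \Bigl(1 - \frac{1}{N}\Bigr)^{g} \frac{1}{L^{3}} \int_0^L \!\! \int_0^L \!\! \int_0^L [\ell\ell'\ell'']^{g}\, d\ell\, d\ell'\, d\ell'',
\]
where $[\ell\ell'\ell'']$ denotes the probability of no recombination among the three sites.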

For $i \geq 4$, this is no longer true, but the results of Hill (1974) can be
used to compute $\mathbf{v}_{i(g)}$ without having to exponentiate the entire drift and
recombination matrices. For example, for $k_4$, the drift and recombination
matrices are $15 \times 15$, but using the technique in Hill (1974), we only need
to exponentiate a $4 \times 4$ matrix to compute $E(k_4)$.

Varying Migration. If $s_g > 0$ for some $g \geq 1$, we obtain a modified version of
Equation 3:
\[
\mathbf{v}_{i(g)} = \mathbf{L}_i \mathbf{D}_{i(g)} \mathbf{U}_i\, \mathbf{v}_{i(g-1)}, \tag{4}
\]
where the diagonal matrix $\mathbf{D}_{i(g)}$ has entries giving the probabilities that the
set of chromosomes, $p$, in a correlation function are all from the hybrid
population in the previous generation:

\[
d_{p,p}(g) = (1 - s_g)^{|p|}.
\]
Note that if $s_g$ is fixed, then equation (4) is linear, and can be solved
using a Laplace transform.
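For the two-point case, the recursion in equation 4 can be iterated directly. The Python sketch below tracks the two entries of $\mathbf{v}_2$ on a grid of distances and applies recombination, migration, and drift in turn, using the $2 \times 2$ forms of $\mathbf{U}_2$, $\mathbf{D}_{2(g)} = \operatorname{diag}(1 - s_g, (1 - s_g)^2)$, and $\mathbf{L}_2$. It rests on assumptions: the function name `expected_k2_migration`, the parameter `p0` (the founding probability that a site has ancestry 1), and the grid-based integration are choices made for this example.

```python
import numpy as np

def expected_k2_migration(s_seq, p0, n_dip, length, no_recomb, n_grid=4001):
    """Iterate v_2(g) = L_2 D_2(g) U_2 v_2(g-1) and integrate (1, -1) . v_2.

    s_seq: introgression probabilities s_1, s_2, ... applied in order.
    no_recomb: function giving P(no recombination) at map distance d.
    """
    d = np.linspace(0.0, length, n_grid)
    c = no_recomb(d)
    same = np.full(n_grid, float(p0))        # both sites on one chromosome
    diff = np.full(n_grid, float(p0) ** 2)   # sites on two different chromosomes
    for s in s_seq:
        same = c * same + (1.0 - c) * diff                          # U_2
        same, diff = (1.0 - s) * same, (1.0 - s) ** 2 * diff        # D_2(g)
        diff = same / (2 * n_dip) + (1.0 - 1.0 / (2 * n_dip)) * diff  # L_2
    y = 2.0 * (length - d) * (same - diff)   # weight 2(L - d) for distance d
    step = d[1] - d[0]
    return step * (y.sum() - 0.5 * (y[0] + y[-1])) / length ** 2

haldane = lambda d: 0.5 * (1.0 + np.exp(-2.0 * d))
# Illustrative scenario: constant introgression s = 0.03 for ten generations.
e_const = expected_k2_migration([0.03] * 10, 0.5, 10**4, 1.0, haldane)
# With no further introgression the recursion reduces to the single-pulse case.
e_pulse = expected_k2_migration([0.0] * 5, 0.3, 10**4, 1.0, lambda d: 1.0 - d)
```

Setting every $s_g$ to zero recovers the single-pulse result, which provides a simple consistency check against the closed form under complete interference.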

Inference of admixture times

The equations in the previous section can be used to develop
method-of-moments estimators for admixture parameters by numerically solving
for the admixture parameters in terms of the expectations of the $k$-statistics.
Substituting in the observed values for the $k$-statistics gives estimates for
the admixture parameter(s).
However, with real data, we only have estimates of the admixture fractions,
so some of the variability seen in the distribution of admixture fractions
will be due to estimation variability. To account for this, we assume
that the estimation errors are additive and iid:
\[
\hat{H}_{i(g)} = H_{i(g)} + \epsilon_i.
\]
Because cumulants are additive,
\[
\begin{aligned}
E(k_n) &= E\bigl(\kappa_n(H_{i(g)} + \epsilon_i \mid Q)\bigr) \\
&= E\bigl(\kappa_n(H_{i(g)} \mid Q)\bigr) + \kappa_n(\epsilon_i).
\end{aligned}
\]
The expectations we have computed are just the first term of this sum. To
correct for the variability in the estimates, we need to subtract off the
second term. We use a block bootstrap to estimate these effects.

One additional complication arises in dealing with genotyping data. We
have assumed that we have the ancestry fractions for each haplotype in the
sample, but with genotyping data, we instead have their pairwise means,
$(H_{1(g)} + H_{2(g)})/2, \ldots$. This results in a decrease in the expectations of
the $k$-statistics. Conditional on the random distribution $Q$, $H_{1(g)}, H_{2(g)}, \ldots$
are iid draws from $Q$. Cumulants are additive, so we use the law of total
expectation to find that
\[
\begin{aligned}
E\Bigl[\kappa_i\Bigl(\frac{H_{1(g)} + H_{2(g)}}{2} \Bigm| Q\Bigr)\Bigr]
&= E\Bigl[\kappa_i\Bigl(\frac{H_{1(g)}}{2} \Bigm| Q\Bigr) + \kappa_i\Bigl(\frac{H_{2(g)}}{2} \Bigm| Q\Bigr)\Bigr] \\
&= 2^{-i+1}\, E\bigl(\kappa_i(H_{1(g)} \mid Q)\bigr) \\
&= 2^{-i+1}\, E(k_i),
\end{aligned}
\]
where $k_i$ on the last line denotes the $i$th $k$-statistic of the haplotype
admixture fractions.
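The two corrections described in this section can be combined in one step: subtract the estimated error cumulant, then rescale from diploid to haploid admixture fractions by the factor $2^{i-1}$. The Python sketch below illustrates this; it assumes that the error cumulant (for example, from a block bootstrap) is measured on the same diploid scale as the observed statistic, and the function name and numerical values are hypothetical.

```python
def corrected_haploid_k(k_obs, kappa_err, order):
    """Correct an observed k-statistic of estimated diploid admixture fractions.

    k_obs: observed k-statistic of order `order` from diploid estimates.
    kappa_err: estimated cumulant of the same order for the estimation error
        (assumed here to be on the diploid scale).
    Returns an estimate of E(kappa_order(H | Q)) for haploid fractions.
    """
    # E(k_obs) = 2**(1 - order) * E(kappa(H | Q)) + kappa(error), solved for the
    # haploid term.
    return 2 ** (order - 1) * (k_obs - kappa_err)

# Hypothetical numbers: observed diploid variance 0.012, error variance 0.002.
k2_haploid = corrected_haploid_k(0.012, 0.002, order=2)
```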

Comparison to Verdu and Rosenberg. The recursion equations given
by Verdu and Rosenberg (2011) are different from the ones we have derived.
This is partly because we have accounted for the effects of genetic drift and
recombination, but also because we are computing the moments of slightly
different quantities.

In Figure 2, we have shown the admixture fractions for five replicate
populations 5, 50, and 500 generations after an admixture pulse. The variance
that Verdu and Rosenberg (2011) compute is the variance over all the replicate
populations, while the variance we have computed in this paper is the
expectation of the variance within a single population. When $g$ is small, these
are similar, but when $g$ is large, the variance within a population goes to zero,
but the variance across the replicate populations does not. This effect is
shown in Figure 3. Initially, both quantities decline exponentially in $g$, but
once $2^g > nLg$, the variance we predict begins to decline linearly instead.
This is because variance is inversely proportional to the number of genetic
ancestors of the sample. When $g$ is small, the number of genetic ancestors
is approximately $2^g$. However, the number of recombination
events in the sample is approximately bounded by $nLg$, so when this quantity
is smaller than $2^g$, it provides a better approximation for the number
of genetic ancestors. In this regime, the variance will decline linearly in $g$.

It is also possible to compute the variance over all population replicates
under our model, which allows a direct comparison to Verdu and Rosenberg
(2011). In the case of one pulse of admixture, we can now solve equation 1
for $P\{A_{1(g)}(\ell) = 1, A_{1(g)}(\ell') = 1\}$ to get
\[
\begin{aligned}
\operatorname{var}(H_{1(g)}) &= E\bigl(H_{1(g)}^{2}\bigr) - s_0^{2} \\
&= \frac{1}{L^{2}} \bigl(s_0 - s_0^{2}\bigr) \int_0^L \!\! \int_0^L \left( [\ell\ell']^{g} \Bigl(1 - \frac{1}{2N}\Bigr)^{g} + \frac{[\ell\ell']}{2N} \cdot \frac{1 - [\ell\ell']^{g} \bigl(1 - \frac{1}{2N}\bigr)^{g}}{1 - [\ell\ell'] \bigl(1 - \frac{1}{2N}\bigr)} \right) d\ell\, d\ell'.
\end{aligned}
\tag{5}
\]
This variance and the expectation of the second $k$-statistic have the same
limit as $N \to \infty$, but for finite $N$, the variance is larger. This is because
\[
\operatorname{var}(H_{1(g)}) = \operatorname{var}\bigl[E(H_{1(g)} \mid Q)\bigr] + E\bigl[\operatorname{var}(H_{1(g)} \mid Q)\bigr] = \operatorname{var}[k_1] + E[k_2].
\]
The first variance is small when $N$ is large, but is always non-negative.
The difference between this equation and equation 1 only becomes significant
on a coalescent time scale. In the absence of genetic drift, the admixture
fractions are approximately independent, because the samples do not share
ancestors.

Application to African American Data. We applied this method to a
subset of the ASW, CEU, and YRI data from the HapMap 3 project
(Consortium et al., 2010). After excluding children from trios, there were
genotypes for 49 ASW, 113 YRI, and 112 CEU individuals. We estimated
the admixture fractions using the supervised learning mode of Admixture,
with the CEU and YRI individuals assigned to separate clusters. The
sampling distribution of the admixture fractions was estimated using the block
bootstrap with $10^4$ replicates and 2678 blocks, giving a block size of
approximately 10 cM. The admixture fractions for the 49 ASW samples are shown
in Figure 1 and the observed $k$-statistics are given in Figure 6.

We assumed a 3-parameter model of constant admixture: for
$g_{start} \leq g \leq g_{stop}$, $s_g = s$, with $s_g = 0$ elsewhere. By matching the
block-bootstrap corrected $k_3$ and $k_4$ to the predictions of equation 1, we
obtained point estimates of
\[
\hat{s} = 0.0277, \qquad \hat{g}_{start} = 2, \qquad \hat{g}_{stop} = 11.
\]

We obtained confidence intervals, shown in Figure 5, by simulation. For
each cell in the grid, we simulated $10^3$ replicates under the corresponding
$g_{start}$ and $g_{stop}$, with $s = 1 - k_1^{1/(g_{stop} - g_{start} + 1)}$. For each replicate, we
computed the $k_2$, $k_3$, and $k_4$ statistics. A cell was then included in the
confidence interval if and only if the corrected $k_2$, $k_3$, and $k_4$ statistics from
the HapMap data fall inside a centered interval containing 98.7% of the
probability mass of the simulated distribution. This mass was chosen so that
under the Bonferroni correction for three tests, there is at least a 95% chance
of including the true parameter values in the confidence region.

The point estimates for $g_{start}$ and $g_{stop}$ correspond to the values for which
the observed $k$-statistics are closest to their simulated medians.
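The inclusion rule for a grid cell can be sketched in a few lines of Python. This is an illustration under assumptions, not the code used for Figure 5: the simulated statistics are taken to be an array with one row per replicate and one column per statistic, the central interval is computed from empirical quantiles, and the function names are hypothetical. The default mass of 0.987 is the value quoted in the text.

```python
import numpy as np

def in_central_interval(simulated, observed, mass):
    """True if `observed` lies in the central interval holding `mass` of `simulated`."""
    tail = (1.0 - mass) / 2.0
    lo, hi = np.quantile(simulated, [tail, 1.0 - tail])
    return bool(lo <= observed <= hi)

def cell_in_region(sim_stats, obs_stats, mass=0.987):
    """Include a grid cell only if every observed statistic passes its own test.

    sim_stats: array of shape (replicates, statistics), e.g. columns k2, k3, k4.
    obs_stats: corrected observed statistics, in the same column order.
    """
    return all(
        in_central_interval(sim_stats[:, j], obs_stats[j], mass)
        for j in range(sim_stats.shape[1])
    )

# Synthetic stand-in for simulated k-statistics (not real data).
rng = np.random.default_rng(1)
sims = rng.normal(size=(1000, 3))
accepted = cell_in_region(sims, [0.0, 0.0, 0.0])
rejected = cell_in_region(sims, [10.0, 0.0, 0.0])
```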

Discussion

We have extended the mechanistic model of Verdu and Rosenberg (2011)
to account for recombination and genetic drift. Doing so allows us to apply
the predictions of this model to data. This mechanistic model allows for a
large number of parameters. For the purposes of inference, it seems that
imposing constraints, e.g. a small number of pulses or constant admixture,
will be needed to narrow the search space.

In this paper, we have assumed that admixture only comes from one
source population, but this need not be the case. To account for admixture
from multiple source populations, equation 1 must be modified to account
for the probability that haplotypes trace their descent to multiple source
populations. Algorithmically, this is feasible, but the notation is
cumbersome. The resulting equations are given in the appendix, along with the
equations for computing expectations of higher-order $k$-statistics.

Application of the method to African-American HapMap data provides
estimates of the time since admixture between people of European and
African descent in America. Notice that the confidence set for the admixture
parameters does not include values of $g_{stop} = 0$. We interpret this as
evidence that admixture rates have declined in the last few generations. The
point estimate of the time gene-flow stopped is $g_{stop} = 2$. This probably
reflects a more gradual reduction in gene-flow within the last 5 generations or
so, rather than a discrete stop in gene-flow 2 generations ago. The discreteness
is enforced by the model. Also notice that admixture before 15 generations
ago can be rejected. With a generation time of 25-30 years, this corresponds
to 325-400 years, and is in good accordance with the historical record. The
point estimate of the time of first admixture is 11 generations, or
approximately 275-330 years ago.

Structure analyses have become one of the most commonly applied tools
in population genomic analyses. The theory developed in this paper allows
users of structure analyses to interpret their data in the context of a model of
admixture between populations, and should find use in many studies aimed
at understanding the history of populations.
[Plot omitted; axis label: African American individuals]

Figure 1. Admixture fractions for 49 African American individuals
in the HapMap 3 data. Source population allele
frequencies were estimated using 113 Yoruban and 111
European individuals.
Figure 2. The admixture fractions of five replicate populations
(each column) 5, 50, and 500 generations after an
admixture pulse. As the admixture event grows more ancient,
the variability within a replicate population decreases,
but some variability is still maintained across the populations.
[Plot omitted; vertical axis: Variance; horizontal axis: g, with ticks at 5, 10, 15]

Figure 3. The variance predicted by Verdu and Rosenberg
(2011) and equation 5, plotted on a logarithmic scale. The
variance we predict (red) is always larger, but the two are very
similar when g is small.






[Plot omitted; vertical axis: E(k_2), logarithmic scale from 0.05 to 1.00; horizontal axis ticks at 20, 40, 60, 80]

Figure 4. The expected sample variance given by equation
1 plotted on a logarithmic scale, for three different map
functions. We used a map distance of $L = 1$ Morgan and
$N = 10^4$. The Haldane map function ($1/2 - e^{-2x}/2$) is in
red, the Kosambi map function ($\tanh(2x)/2$) is in yellow,
and the complete interference map function ($x$) is in blue. For
all values of $g$, the expectations are ordered in the same order
as the map functions, but the difference between the three
disappears by $g = 100$.
Figure 5. 95% confidence region for a model with constant
admixture from generations $g_{start}$ to $g_{stop}$. The point
estimate of $g_{start} = 11$ and $g_{stop} = 2$ generations ago is colored
green.

Figure 6. $k$-statistics



Appendix

These are the matrices for computing $E(k_3)$. The matrices for computing
$E(k_4)$ are $15 \times 15$ and not given here, but can be found in Hill (1974).
When there is migration from both source populations, the recursion
relations for the $i$-point correlation functions will depend on $(i-1)$-point,
$(i-2)$-point, $\ldots$ correlation functions as well. As an example, consider the
case of $\mathbf{v}_{2(g)}$. Let the introgression probability from the second source
population be given by $t_g$. The recursion equation for $\mathbf{v}_{2(g)}$ now also depends
on $\mathbf{v}_{1(g)}$:
\[
\begin{aligned}
\mathbf{v}_{2(g+1)} &= \mathbf{L}_2 \begin{pmatrix} 1 - s_g - t_g & 0 \\ 0 & (1 - s_g - t_g)^{2} \end{pmatrix} \mathbf{U}_2 \mathbf{v}_{2(g)}
+ \begin{pmatrix} t_g \\ t_g^{2} + 2 t_g P\{A_{1(g)}(\ell) = 1\} \end{pmatrix} \\
&= \mathbf{L}_2 \begin{pmatrix} 1 - s_g - t_g & 0 \\ 0 & (1 - s_g - t_g)^{2} \end{pmatrix} \mathbf{U}_2 \mathbf{v}_{2(g)}
+ \begin{pmatrix} t_g \\ t_g^{2} + 2 t_g \mathbf{v}_{1(g)} \end{pmatrix}.
\end{aligned}
\]
Similarly, the recursion equation for $\mathbf{v}_{3(g)}$ depends on $\mathbf{v}_{2(g)}$ and $\mathbf{v}_{1(g)}$.



